Overview

Dataset statistics

Number of variables: 8
Number of observations: 908
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 56.9 KiB
Average record size in memory: 64.1 B

Variable types

NUM: 8

Warnings

df_index has unique values (Unique)
SM1_Dz(z) has 36 (4.0%) zeros (Zeros)
NdsCH has 760 (83.7%) zeros (Zeros)
NdssC has 622 (68.5%) zeros (Zeros)
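
The zero percentages in the warnings above follow from the zero counts and the 908 observations. A minimal sketch of that arithmetic; the helper name `zero_share` is hypothetical and not part of pandas-profiling:

```python
# Hypothetical helper: share of zero-valued entries as a percentage,
# rounded to one decimal place as in the warnings above.
def zero_share(zero_count: int, total: int) -> float:
    return round(100.0 * zero_count / total, 1)


N_OBSERVATIONS = 908  # "Number of observations" in the overview
ZERO_COUNTS = {"SM1_Dz(z)": 36, "NdsCH": 760, "NdssC": 622}  # from the warnings

for name, count in ZERO_COUNTS.items():
    print(f"{name}: {zero_share(count, N_OBSERVATIONS)}% zeros")
```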

Reproduction

Analysis started: 2022-08-25 09:28:47.774150
Analysis finished: 2022-08-25 09:29:06.887163
Duration: 19.11 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 908
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 454.5
Minimum: 1
Maximum: 908
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 46.35
Q1: 227.75
median: 454.5
Q3: 681.25
95-th percentile: 862.65
Maximum: 908
Range: 907
Interquartile range (IQR): 453.5
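
The df_index quantiles above are consistent with linear interpolation between order statistics, which is the default convention in NumPy and pandas. A minimal sketch under that assumption; the function `quantile` is an illustrative helper, not the report's own code:

```python
import math


def quantile(values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    data = sorted(values)
    pos = q * (len(data) - 1)          # fractional 0-based position
    lo = math.floor(pos)
    hi = min(lo + 1, len(data) - 1)
    frac = pos - lo
    return data[lo] + frac * (data[hi] - data[lo])


df_index = range(1, 909)               # df_index runs from 1 to 908
q1 = quantile(df_index, 0.25)
q3 = quantile(df_index, 0.75)
print("5-th percentile:", round(quantile(df_index, 0.05), 2))
print("Q1:", q1, "median:", quantile(df_index, 0.5), "Q3:", q3)
print("IQR:", q3 - q1, "Range:", max(df_index) - min(df_index))
```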

Descriptive statistics

Standard deviation: 262.2613201
Coefficient of variation (CV): 0.5770326074
Kurtosis: -1.2
Mean: 454.5
Median Absolute Deviation (MAD): 227
Skewness: 0
Sum: 412686
Variance: 68781
Monotonicity: Strictly increasing
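
Because df_index is simply the integers 1 to 908, several of the descriptive statistics above can be re-derived with the Python standard library. A minimal sketch, assuming the report uses the sample (n - 1) variance and defines MAD as the median of absolute deviations from the median:

```python
import statistics

df_index = list(range(1, 909))

total = sum(df_index)                       # 412686
mean = statistics.mean(df_index)            # 454.5
variance = statistics.variance(df_index)    # sample variance, n - 1 denominator
std = statistics.stdev(df_index)
cv = std / mean                             # coefficient of variation
median = statistics.median(df_index)
mad = statistics.median(abs(x - median) for x in df_index)

print(total, mean, variance, round(std, 7), round(cv, 10), mad)
```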
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.1%
625                 1      0.1%
599                 1      0.1%
600                 1      0.1%
601                 1      0.1%
602                 1      0.1%
603                 1      0.1%
604                 1      0.1%
605                 1      0.1%
606                 1      0.1%
Other values (898)  898    98.9%

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%

Value  Count  Frequency (%)
908    1      0.1%
907    1      0.1%
906    1      0.1%
905    1      0.1%
904    1      0.1%
 

CICO
Real number (ℝ≥0)

Distinct: 502
Distinct (%): 55.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.898128855
Minimum: 0.667
Maximum: 5.926
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.667
5-th percentile: 1.7754
Q1: 2.347
median: 2.934
Q3: 3.407
95-th percentile: 4.16925
Maximum: 5.926
Range: 5.259
Interquartile range (IQR): 1.06

Descriptive statistics

Standard deviation: 0.7560884791
Coefficient of variation (CV): 0.2608884963
Kurtosis: -0.04150551962
Mean: 2.898128855
Median Absolute Deviation (MAD): 0.5285
Skewness: 0.04545787455
Sum: 2631.501
Variance: 0.5716697882
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
2.126               14     1.5%
3.08                11     1.2%
2.377               9      1.0%
2.834               7      0.8%
2.08                7      0.8%
3.252               7      0.8%
2.508               7      0.8%
2.479               7      0.8%
3.179               6      0.7%
2.216               6      0.7%
Other values (492)  827    91.1%

Value  Count  Frequency (%)
0.667  2      0.2%
0.965  3      0.3%
0.973  1      0.1%
1      3      0.3%
1.075  1      0.1%

Value  Count  Frequency (%)
5.926  1      0.1%
5.158  1      0.1%
4.88   1      0.1%
4.829  1      0.1%
4.81   1      0.1%
 

SM1_Dz(z)
Real number (ℝ≥0)

ZEROS

Distinct: 186
Distinct (%): 20.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6284680617
Minimum: 0
Maximum: 2.171
Zeros: 36
Zeros (%): 4.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.134
Q1: 0.223
median: 0.57
Q3: 0.89275
95-th percentile: 1.4396
Maximum: 2.171
Range: 2.171
Interquartile range (IQR): 0.66975

Descriptive statistics

Standard deviation: 0.4284590914
Coefficient of variation (CV): 0.6817515759
Kurtosis: -0.117152754
Mean: 0.6284680617
Median Absolute Deviation (MAD): 0.347
Skewness: 0.6950900326
Sum: 570.649
Variance: 0.183577193
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.223               135    14.9%
0.134               74     8.1%
0.405               69     7.6%
0.331               39     4.3%
0                   36     4.0%
0.693               26     2.9%
0.56                25     2.8%
0.496               24     2.6%
0.251               21     2.3%
0.58                16     1.8%
Other values (176)  443    48.8%

Value  Count  Frequency (%)
0      36     4.0%
0.134  74     8.1%
0.223  135    14.9%
0.251  21     2.3%
0.288  1      0.1%

Value  Count  Frequency (%)
2.171  1      0.1%
2.071  1      0.1%
2.044  1      0.1%
1.86   1      0.1%
1.834  1      0.1%
 

GATS1i
Real number (ℝ≥0)

Distinct: 557
Distinct (%): 61.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.29359141
Minimum: 0.396
Maximum: 2.92
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.396
5-th percentile: 0.799
Q1: 0.95075
median: 1.2405
Q3: 1.56225
95-th percentile: 2.00985
Maximum: 2.92
Range: 2.524
Interquartile range (IQR): 0.6115

Descriptive statistics

Standard deviation: 0.394302736
Coefficient of variation (CV): 0.3048124261
Kurtosis: 0.3268399411
Mean: 1.29359141
Median Absolute Deviation (MAD): 0.2995
Skewness: 0.7231072608
Sum: 1174.581
Variance: 0.1554746476
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.941               8      0.9%
1.179               7      0.8%
0.938               7      0.8%
0.954               7      0.8%
0.871               7      0.8%
1.189               6      0.7%
1.6                 6      0.7%
1.571               6      0.7%
1.077               5      0.6%
0.95                5      0.6%
Other values (547)  844    93.0%

Value  Count  Frequency (%)
0.396  1      0.1%
0.421  1      0.1%
0.523  1      0.1%
0.595  1      0.1%
0.618  1      0.1%

Value  Count  Frequency (%)
2.92   1      0.1%
2.698  1      0.1%
2.672  1      0.1%
2.609  1      0.1%
2.606  1      0.1%
 

NdsCH
Real number (ℝ≥0)

ZEROS

Distinct: 5
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2290748899
Minimum: 0
Maximum: 4
Zeros: 760
Zeros (%): 83.7%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 4
Range: 4
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.6053349942
Coefficient of variation (CV): 2.642520071
Kurtosis: 13.81346349
Mean: 0.2290748899
Median Absolute Deviation (MAD): 0
Skewness: 3.400814701
Sum: 208
Variance: 0.3664304552
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=5)
Value  Count  Frequency (%)
0      760    83.7%
1      107    11.8%
2      29     3.2%
4      7      0.8%
3      5      0.6%

Value  Count  Frequency (%)
0      760    83.7%
1      107    11.8%
2      29     3.2%
3      5      0.6%
4      7      0.8%

Value  Count  Frequency (%)
4      7      0.8%
3      5      0.6%
2      29     3.2%
1      107    11.8%
0      760    83.7%
 

NdssC
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4856828194
Minimum: 0
Maximum: 6
Zeros: 622
Zeros (%): 68.5%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.8612789371
Coefficient of variation (CV): 1.773336224
Kurtosis: 6.423355202
Mean: 0.4856828194
Median Absolute Deviation (MAD): 0
Skewness: 2.239090332
Sum: 441
Variance: 0.7418014076
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
0      622    68.5%
1      176    19.4%
2      81     8.9%
3      18     2.0%
4      8      0.9%
6      2      0.2%
5      1      0.1%

Value  Count  Frequency (%)
0      622    68.5%
1      176    19.4%
2      81     8.9%
3      18     2.0%
4      8      0.9%

Value  Count  Frequency (%)
6      2      0.2%
5      1      0.1%
4      8      0.9%
3      18     2.0%
2      81     8.9%
 

MLOGP
Real number (ℝ)

Distinct: 559
Distinct (%): 61.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.109285242
Minimum: -2.884
Maximum: 6.515
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: -2.884
5-th percentile: -0.317
Q1: 1.209
median: 2.127
Q3: 3.105
95-th percentile: 4.48745
Maximum: 6.515
Range: 9.399
Interquartile range (IQR): 1.896

Descriptive statistics

Standard deviation: 1.433180788
Coefficient of variation (CV): 0.6794627676
Kurtosis: 0.006986919075
Mean: 2.109285242
Median Absolute Deviation (MAD): 0.937
Skewness: -0.03519131625
Sum: 1915.231
Variance: 2.054007172
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1.701               10     1.1%
0.8                 10     1.1%
0.202               9      1.0%
1.064               9      1.0%
2.604               9      1.0%
1.748               9      1.0%
1.442               8      0.9%
2.193               8      0.9%
1.587               8      0.9%
1.859               7      0.8%
Other values (549)  821    90.4%

Value   Count  Frequency (%)
-2.884  1      0.1%
-2.089  1      0.1%
-2.03   1      0.1%
-1.96   1      0.1%
-1.358  1      0.1%

Value  Count  Frequency (%)
6.515  1      0.1%
6.203  1      0.1%
6.166  1      0.1%
5.934  1      0.1%
5.741  1      0.1%
 

LC50[-LOG(mol/L)]
Real number (ℝ≥0)

Distinct: 827
Distinct (%): 91.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.064430617
Minimum: 0.053
Maximum: 9.612
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.053
5-th percentile: 1.68385
Q1: 3.15175
median: 3.9875
Q3: 4.9075
95-th percentile: 6.48365
Maximum: 9.612
Range: 9.559
Interquartile range (IQR): 1.75575

Descriptive statistics

Standard deviation: 1.455698446
Coefficient of variation (CV): 0.3581555655
Kurtosis: 0.6638471981
Mean: 4.064430617
Median Absolute Deviation (MAD): 0.867
Skewness: 0.2521384377
Sum: 3690.503
Variance: 2.119057965
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
4.208               4      0.4%
3.513               4      0.4%
3.47                3      0.3%
3.979               3      0.3%
3.926               3      0.3%
3.66                3      0.3%
4.54                2      0.2%
3.751               2      0.2%
3.841               2      0.2%
4.499               2      0.2%
Other values (817)  880    96.9%

Value  Count  Frequency (%)
0.053  1      0.1%
0.15   1      0.1%
0.242  1      0.1%
0.33   1      0.1%
0.361  1      0.1%

Value  Count  Frequency (%)
9.612  1      0.1%
9.354  1      0.1%
8.916  1      0.1%
8.604  1      0.1%
8.571  1      0.1%
 

Interactions

[Interaction scatter plots omitted: 64 panels, one for each ordered pair of the 8 variables.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the slope of the line, that is, its angle to the x-axis, does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
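
The definition above can be sketched in a few lines of Python. This is an illustrative implementation, not the code pandas-profiling runs; the function name `pearson_r` is hypothetical, and the example values are the MLOGP and LC50 columns of the first five sample rows of this report.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n - 1) factors of the covariance and of both variances cancel,
    # so plain sums of products are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


mlogp = [1.453, 1.348, 1.348, 1.807, 1.886]   # first five sample rows
lc50 = [3.770, 3.115, 3.531, 3.510, 5.390]

r = pearson_r(mlogp, lc50)
# Shifting and positively rescaling one variable leaves r unchanged.
r_shifted = pearson_r(mlogp, [2.0 * v + 3.0 for v in lc50])
print(r, r_shifted)
```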

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
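
A minimal sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. It assumes tied values receive the average of their ranks, a common convention; the helper names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1       # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]                # monotonic but nonlinear
print(spearman_rho(x, y), pearson_r(x, y))
```

On this monotonic but nonlinear example the rank-based coefficient reaches 1 while Pearson's r stays below it.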

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
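
The pair-counting definition can be sketched directly. This is the simple variant often called tau-a, in which tied pairs count as neither concordant nor discordant; library implementations commonly apply a tie correction (tau-b), so results can differ on data with many ties, such as NdsCH or NdssC. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1            # both variables move in the same direction
        elif sign < 0:
            discordant += 1            # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))  # 2 concordant, 1 discordant, 3 pairs
```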

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Missing-value plots omitted (2 figures); the dataset has 0 missing cells.]

Sample

First rows

     df_index  CICO   SM1_Dz(z)  GATS1i  NdsCH  NdssC  MLOGP  LC50[-LOG(mol/L)]
0    1         3.260  0.829      1.676   0      1      1.453  3.770
1    2         2.189  0.580      0.863   0      0      1.348  3.115
2    3         2.125  0.638      0.831   0      0      1.348  3.531
3    4         3.027  0.331      1.472   1      0      1.807  3.510
4    5         2.094  0.827      0.860   0      0      1.886  5.390
5    6         3.222  0.331      2.177   0      0      0.706  1.819
6    7         3.179  0.000      1.063   0      0      2.942  3.947
7    8         3.000  0.000      0.938   1      0      2.851  3.513
8    9         2.620  0.499      0.990   0      0      2.942  4.402
9    10        2.834  0.134      0.950   0      0      1.591  3.021

Last rows

     df_index  CICO   SM1_Dz(z)  GATS1i  NdsCH  NdssC  MLOGP  LC50[-LOG(mol/L)]
898  899       3.599  0.702      1.514   2      1      3.247  6.183
899  900       2.986  0.961      1.669   0      4      1.798  3.152
900  901       2.804  1.110      0.618   0      6      1.317  6.254
901  902       3.670  0.728      2.110   0      3      2.288  2.964
902  903       3.475  0.405      0.875   1      2      3.148  4.803
903  904       2.801  0.728      2.226   0      2      0.736  3.109
904  905       3.652  0.872      0.867   2      3      3.983  4.040
905  906       3.763  0.916      0.878   0      6      2.918  4.818
906  907       2.831  1.393      1.077   0      1      0.906  5.317
907  908       4.057  1.032      1.183   1      3      4.754  8.201